Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update default settings #338

Merged
merged 7 commits into from
Oct 23, 2024
Merged

Update default settings #338

merged 7 commits into from
Oct 23, 2024

Conversation

JelteF
Copy link
Collaborator

@JelteF JelteF commented Oct 22, 2024

Fixes #217

This changes the defaults for duckdb.max_memory and duckdb.disabled_filesystems to the agreed upon settings. While doing that I found two issues that this also fixes:

  1. duckdb.disabled_filesystems = 'LocalFilesystem' was basically a way to make DuckDB execution unusable, due to us doing file system access when initializing the database/connection. So now we enable the setting later. Also we now don't call INSTALL {extension} during connection initialization anymore.
  2. Synchronise extensions before reconnect #316 didn't fix the loading of extensions when duckdb.install_extension() was called. Only for direct inserts to the duckdb.extensions table. That's fixed by not using the low-level CatalogTupleInsert, and instead switching to SPI.
  3. Postgres superusers should not be restricted by duckdb.disabled_filesystems, they should be able to do whatever they want. So now this setting is not actually configured for superusers anymore.

@JelteF
Copy link
Collaborator Author

JelteF commented Oct 22, 2024

@Y-- can you look into these test failures related to disabled_filesystems='LocalFilesystem' to see if they are fixable? We might have to only set disabled_filesystems while a user is executing a query or something.

@JelteF
Copy link
Collaborator Author

JelteF commented Oct 22, 2024

One other probably good change is to never set disabled_filesystems in duckdb for queries done by the postgres superuser, but that wouldn't apply to the changes I'm doing for: #271

So that's not really a real solution, just a related thing that we probably want to do anyway.

@Y-- Y-- requested a review from mkaruza October 22, 2024 08:54
wuputah
wuputah previously approved these changes Oct 22, 2024
@@ -3,3 +3,4 @@
shared_preload_libraries = 'pg_duckdb'
duckdb.force_execution = true
log_temp_files = -1
duckdb.disabled_filesystems = ''
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be necessary. If we need to do this in tests it means this is a bad default (at the moment). Having a default that breaks half of pg_duckdb half of the functionality completely, even for the superuser

I think there's two things we need to do:

  1. temporarily change duckdb_disabled_filesystems to '' when we know we want to interact with the filesystem. Like installing duckdb extensions, or when caching a parquet file.
  2. Make sure that for postgres superusers this setting is always set to ''. They can already access files on the filesystem in various ways, so there's no point in blocking them when executing queries in DuckDB. Also they could just change the duckdb.disabled_filesystems setting.

Those two should probably build on top of #342

@JelteF JelteF changed the base branch from main to pgspot-fixes October 23, 2024 09:24
values[Anum_duckdb_extension_enable - 1] = 1;

/* create heap tuple and insert into catalog table */
Relation duckdb_extensions_relation = relation_open(ExtensionsTableRelationId(), RowExclusiveLock);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is ExtensionsTableRelationId still used somewhere now?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In ReadDuckdbExtensions

src/pgduckdb_options.cpp Show resolved Hide resolved
src/pgduckdb_duckdb.cpp Outdated Show resolved Hide resolved

/* inserting extension record */
HeapTuple new_tuple = heap_form_tuple(tuple_descr, values, nulls);
CatalogTupleInsert(duckdb_extensions_relation, new_tuple);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This low level API was not causing the trigger to be fired. So this PR actually also fixes a bug where the duckdb.install_extension() call would not update the sequence, and thus not existing cause connections to load the extension.

@JelteF JelteF changed the base branch from pgspot-fixes to non-superuser October 23, 2024 13:21
@JelteF
Copy link
Collaborator Author

JelteF commented Oct 23, 2024

Okay, I made the necessary changes to make these new defaults usable. It now depends on #342 though.

Copy link
Collaborator

@Y-- Y-- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Y--
Y-- previously approved these changes Oct 23, 2024
Copy link
Collaborator

@Y-- Y-- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Base automatically changed from non-superuser to main October 23, 2024 14:53
@JelteF JelteF dismissed stale reviews from Y-- and wuputah October 23, 2024 14:53

The base branch was changed.

@JelteF JelteF enabled auto-merge (squash) October 23, 2024 15:15
@JelteF JelteF requested a review from Y-- October 23, 2024 15:15
@JelteF JelteF merged commit c531f72 into main Oct 23, 2024
4 checks passed
@JelteF JelteF deleted the default-settings branch October 23, 2024 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Decide on default DuckDB configuration that makes sense for use in postgres
3 participants